DNA sequence of the enzyme phospholipase A1 of the ciliate Tetrahymena, and the use of the same

ABSTRACT

A nucleic acid coding for the phospholipase A1 from ciliates. In particular, the phospholipase A1 has the amino acid sequence SEQ ID No. 7.

The invention relates to a nucleic acid coding for the phospholipase A1 and the use thereof according to the preamble of claims 1 to 5.

Yeasts, bacteria and mammalian cells are of great importance to the biotechnological preparation and production of recombinant active substances by the heterologous expression of foreign proteins. Bacterial expression systems based on E. coli or B. subtilis are used for the production of recombinant peptides or proteins, such as insulin, interleukin-2, tissue plasminogen activator, proteases and lipases. In Gram-negative bacteria, the expression systems are mostly based on the use of genetic elements, such as the lac operon or the tryptophan operon. The proteins foreign to the host are produced either into “inclusion bodies” within the cell or, when expression systems based on β-lactamase genes are used, into the periplasmic space. The production of recombinant proteins into the surrounding fermentation medium has not been established. In Gram-positive bacteria, to date, almost exclusively cell-inherent proteins are introduced in expression systems and expressed.

Yeasts, such as S. cerevisiae, Hansenula polymorpha, Kluyveromyces lactis or Pichia pastoris, are also employed for the heterologous expression of recombinant proteins, such as human factor XIIIa, bovine pro-chymosin, phytase or surface antigens. Here, the expression systems are based on shuttle vectors (vectors having both yeast and bacterial portions) which are based (depending on the yeast species) on the genetic elements of galacto-kinase-epimerase, methanol oxidase, acid phosphatase or alcohol dehydrogenase. As a rule, the recombinant protein is produced into the cytoplasm of the cell. When yeast-inherent signal sequences, such as the alpha factor, are used, the expressed proteins may also be secreted into the fermentation medium. The glycosylation of secreted proteins is effected according to the “high mannose” type, and frequently there are hyperglycosylations on the protein which may result in the formation of antibodies in the patient.

Mammalian cells, such as various cell types from rodents (CHO cells, C127 cells) or simians (Vero, CV-1 or COS cells), are also employed for the heterologous expression of recombinant proteins. Here, the expression systems are based on recombinant viruses (BPV vector) or on shuttle vectors. To regulate the expression, viral SV40 enhancer/promoter systems or cellular enhancer elements are employed. The recombinant proteins, such as erythropoietin, are secreted into the fermentation medium because the foreign genes usually bring their own signal sequences, which are understood by the expression system and used for targeting.

Further, for the biotechnological production of glycosylated extracellular enzymes, protozoans of the genus Tetrahymena are employed. Tetrahymena will grow on inexpensive fermentation media using standard fermentation methods. For the transformation of such Tetrahymena cells, vectors are available which are based on the rDNA elements of Tetrahymena. For the heterologous expression of bacterial proteins in Tetrahymena, DNA constructs made from genes from Tetrahymena are employed. When suitable genetic elements for the regulation of the transcription, targeting and glycosylation of foreign proteins are available, Tetrahymena is an ideal expression system for the inexpensive production of therapeutic recombinant proteins.

The Gram-negative bacterial expression systems used to date usually lead to the formation of “inclusion bodies” in the cell, accompanied by a denaturing of the proteins. To recover the recombinant protein, the cells must be lysed, and the denatured inactive protein must be folded back to function. This causes additional cost-intensive process steps and reduces the yield of the desired protein. Glycosylation, which is important to eukaryotic proteins, is completely omitted. When Gram-positive bacterial expression systems are used, degradation of the target protein due to high proteolytic activities in the fermentation broth is an additional problem.

When yeasts are used for heterologous expression, the desired target protein is also produced only into the cell, from where it must be removed by cell lysis. As in bacterial expression systems, this causes additional time- and cost-intensive process steps. When yeast-inherent signal peptides are used, the foreign proteins are not correctly spliced and glycosylated for secretion.

In contrast, when mammalian cell systems are employed for the production of recombinant proteins, the desired proteins are found in the fermentation medium in an extracellular state, correctly spliced and glycosylated. However, what is disadvantageous here is, on the one hand, the low expression rate due to the defective processing and inefficient translation of genes which have been introduced into the genome of the production cell line via viral vectors. On the other hand, the serum-containing fermentation media for mammalian cells are extremely cost-intensive. In addition, the fermentation technology for the shear-sensitive cell lines is complicated and similarly expensive due to constructions for bubble-free aeration. Further problems arise from the high infection risk for the cell lines from mycoplasmas and viruses. All in all, the use of mammalian cells for the biotechnological preparation of recombinant proteins results in very high costs, safety demands and low yields.

The above-mentioned drawbacks in the production of recombinant proteins do not apply to the use of ciliates, such as Tetrahymena. Thus, for example, some acid hydrolases which are involved in the digestion of food particles are exported from the cell in high quantities and with complex glycosylation.

In J. Euk. Microbiol. 43 (4), 1996, pages 295 to 303, Alam et al. describe the cloning of a gene which codes for the acid α-glucosidase of Tetrahymena pyriformis. However, only a small portion of the protein is exported from the cell. Further, the International Patent Application PCT/EP 00/01853 describes the gene of a β-hexosaminidase from Tetrahymena thermophila which is known, however, to be exported from the cell to only about 80%.

However, to date, it has not been possible to cause glycosylated eukaryotic proteins to be expressed in Tetrahymena and also be exclusively secreted into the fermentation medium. This is because the DNA sequences of extracellular proteins inherent to Tetrahymena which are necessary for the construction of expression vectors and which exclusively export the foreign protein into the surrounding fermentation medium have as yet been unknown. The DNA sequences which code for the β-hexosaminidase of Tetrahymena thermophila are known. Such a sequence has been filed for a patent application under the official file numbers DE 199 58 979.8, DE 199 09 189.7 and under PCT/EP 00/01853. However, there is a disadvantage of these sequences in that the pre/pro-peptides containing them will target a protein foreign to the host into the surrounding fermentation medium to only about 80%. This is due to the fact that the enzyme β-hexosaminidase is present to about 20% within the membrane under natural conditions, and only about 80% of the naturally produced enzyme is exported from the cell. For this reason, pre/pro-peptides of β-hexosaminidase, when positioned in front of a protein foreign to the host by genetic engineering methods, will target about 20% of this protein foreign to the host into the cytoplasmic membrane on the surface of Tetrahymena thermophila. This is associated with a considerable process-technological disadvantage for the production of recombinant active substances. On the one hand, the yield is decreased because part of the expressed protein remains in the cells bound to the membrane, and thus it is not possible to purify the entire expressed protein from the fermenter broth. On the other hand, the protein foreign to the host in the cell membrane can exert toxic effects on the host cells and thus slow down the cell growth.

Further, no constitutive promoters of Tetrahymena which cause a consistent or continuous transcription of heterologous proteins have been known to date. To date, only promoters of histone and tubulin genes have been known (Bannon et al., 1984, Gaertig et al., 1993). However, a critical disadvantage of these promoters is that their activation is dependent on the cell cycle. Genes of heterologous proteins which are linked to such cell-cycle-dependent promoters are caused to be expressed only in growing or dividing cells. This has considerable process-technological disadvantages since the desired protein is thus produced only in the logarithmic growth phase. In the stationary growth phase in which the highest cell density and thus the highest performance of the expression organism (Tetrahymena) is reached in the production process, there is hardly any cell growth left and thus only a low expression of the heterologous protein takes place.

It is an object of the invention to provide a system which enables heterologous proteins to be produced in an expression system and, after transformation into Tetrahymena, to be exported from the cells into the fermentation medium.

This object is achieved by a system in which a nucleic acid having the sequence SEQ ID No. 1 coding for a phospholipase A1 (SEQ ID No. 7) is employed. Advantageously, the expression product of this DNA is exported from the cell in large amounts under culturing conditions. The expressed protein is exported into the surrounding culture medium to a high extent and is not contained in the membrane. The nucleic acid sequence according to the invention contains a promoter which causes a constitutive, i.e., cell-cycle-independent, transcription of the downstream genes of heterologous proteins. Such constitutive transcription has the advantage that the proteins are continuously expressed by heterologous expression in the host organism without being affected by the cell cycle. Thus, the transcription of the foreign gene can be effected and the heterologous protein expressed also during the stationary growth phase with a low cell growth.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 sequentially depicts a nucleic acid and its components.

FIG. 2 sequentially depicts polypeptide sequences.

FIG. 3 graphically illustrates an elution profile.

FIG. 4 graphically illustrates an elution profile.

FIG. 5 graphically illustrates an elution profile.

The DNA sequence of phospholipase A₁ according to the invention preferably includes an upstream region of PLA₁ (SEQ ID No. 2) which bears the promoter elements for the initiation of transcription, a signal peptide and a pro-peptide, further genetic elements for the targeting of proteins and, in particular, a downstream region of PLA₁ (SEQ ID No. 3) which contains genetic elements for the termination of transcription. The use of these nucleic acids in a vector makes it possible to express heterologous proteins independently of the cell cycle and to transport them selectively out of the cell and into the surrounding culture medium without the expressed proteins becoming incorporated in the cytoplasmic membrane, whereby such proteins can be isolated from the fermentation broth without cell lysis.

FIG. 1 shows a nucleic acid coding for the upstream region (SEQ ID No. 2), the coding region (SEQ ID No. 1) and the downstream region (SEQ ID No. 3) of phospholipase A₁ from ciliates.

FIG. 2 shows a corresponding expression product of the nucleic acid according to SEQ ID No. 1. The invention also relates to the protein according to SEQ ID No. 7.

In particular, the invention also relates to the signal sequence (SEQ ID No. 6) of the protein according to the invention. Preferably, these are the amino acids 1 to 110 of the protein according to the invention (SEQ ID No. 5). The invention also relates to a nucleic acid coding for the N-terminal fragment (SEQ ID No. 3). This is preferably a fragment of the nucleic acids according to the invention (SEQ ID No. 4), especially having the nucleic acid sequence 1 to 155 according to FIG. 1.

The nucleic acid sequence of the non-translated region (upstream region) (SEQ ID No. 2) upstream from the coding sequence region of the PLA₁ from Tetrahymena is positioned between position −275 and position −1 (represented in lowercase letters). The established non-translated region comprises 275 bases. As elements of a promoter, a TATA box is found on positions −49 to −55 (printed in boldface), and a putative CAAT box is found between base −133 and base −136 (printed in boldface). The coding sequence range of the cDNA is represented in capital letters. The numbering of the sequence begins with the start codon ATG. Regions known from protein sequencing are boxed, and the stop codon is underlined. The mature protein is coded from base 331. The sequence listing from base 1 to base 330 represents the pre/pro sequence (SEQ ID No. 8) of PLA₁. The sequence listing from base 331 to base 963 is the sequence of the mature PLA₁ (SEQ ID No. 9). In position 961, there is the translation stop TGA, and in position 1039, there is the polyadenylation signal AAT AAA. The nucleic acid sequence from position 964 to position 1134, which is below the coding sequence of the PLA₁ of Tetrahymena, represents the downstream region of PLA₁ (SEQ ID No. 3) which is not translated (also represented in lowercase letters). In position 964 to position 1101, there is the region known from the sequencing of the cDNA, which was also confirmed by inverse PCR. After transcription, the poly-A tail is attached to the last codon of the mRNA (ttt, positions 1098–1101).
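The coordinates given above can be cross-checked with a short calculation. The following Python sketch is illustrative only: the dictionary and function names are assumptions introduced here, and the residue count for the mature protein assumes that the stop codon at positions 961 to 963 lies inside the stated range 331 to 963.

```python
# Region boundaries as stated in the text (1-based, inclusive).
# Numbering starts at the ATG; no range used here crosses zero.
REGIONS = {
    "upstream": (-275, -1),
    "pre_pro": (1, 330),
    "mature": (331, 963),
    "downstream": (964, 1134),
}


def region_length(start: int, end: int) -> int:
    """Number of bases in an inclusive range that does not cross zero."""
    return end - start + 1


# Length in bases of every region named in the description.
lengths = {name: region_length(s, e) for name, (s, e) in REGIONS.items()}

# 330 bases / 3 bases per codon = number of pre/pro amino acids.
pre_pro_residues = lengths["pre_pro"] // 3

# The mature range includes the TGA stop codon (assumption stated above),
# so one codon is subtracted to obtain the number of amino acids.
mature_codons = lengths["mature"] // 3
mature_residues = mature_codons - 1

if __name__ == "__main__":
    for name, n in lengths.items():
        print(f"{name}: {n} bases")
    print(f"pre/pro peptide: {pre_pro_residues} amino acids")
    print(f"mature protein: {mature_residues} amino acids")
```

Under these assumptions the pre/pro range corresponds to the 110 amino acids mentioned for the pre/pro peptide, and the mature range has the 633 bases reported for the cDNA.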

A further aspect of the invention is the use of a nucleic acid sequence of acid hydrolases according to the invention or parts thereof for the homologous or heterologous expression of recombinant proteins and peptides, and for homologous or heterologous recombination (“knock-out”, “gene replacement”).

The invention also relates to a method for the homologous or heterologous expression of proteins and peptides and for the homologous or heterologous recombination (“knock-out”, “gene replacement”) in which ciliates are transfected with a nucleic acid according to the invention.

The nucleic acids or parts thereof may be combined, in particular, with the enhancers, promoters, operators, origins, terminators, antibiotic resistances usual for the homologous or heterologous expression of proteins, or with other nucleic acids or DNA fragments or all kinds of sequences from viroids, viruses, bacteria, archezoans, protozoans, fungi, plants, animals or humans.

In particular, the nucleic acid according to the invention is contained in a vector, a plasmid, a cosmid, a chromosome or minichromosome, a transposon, an IS element, an rDNA, or all kinds of circular or linear DNA or RNA.

The invention also relates to a method in which the nucleic acid or parts thereof according to the invention which code for phospholipase A₁ are combined with the elements usual in homologous or heterologous expression: enhancers, such as the NF-1 region (a cytomegalovirus enhancer), promoters, such as the lac, trc, tic or tac promoters, the promoters of classes II and III of the T7 RNAP system, bacteriophage T7 and SP6 promoters, aprE, amylase or spac promoters for Bacillus expression systems, AOX1, AUG1 and 2 or GAPp promoters (Pichia) for yeast expression systems, RSV promoter (SV40 virus), CMV promoter (Cytomegalovirus), AFP promoter (adenoviruses) or metallothionein promoters for mammalian expression systems, Sindbis virus promoters or Semliki Forest virus promoters for insect cells, promoters for insect cell expression systems, such as hsp70, DS47, actin 5C or copia, plant-specific promoters, such as 35S promoter (cauliflower mosaic virus), amylase promoter or class I patatin promoter, operators, such as the tet operator, signal peptides, such as α-MF prepro signal sequences (Saccharomyces), origins, terminators, antibiotic and drug resistances, such as ampicillin, kanamycin, streptomycin, chloramphenicol, penicillin, amphotericin, cycloheximide, 6-methylpurine, paromomycin, hygromycin, α-amanitin, auxotrophy markers, such as the gene of dihydrofolate reductase, or other nucleic acids or DNA fragments, or all kinds of sequences from viroids, viruses, bacteria, archezoans, protozoans, fungi, plants, animals or humans.

In particular, the nucleic acid or parts thereof according to the invention are inserted into a vector, a plasmid, a cosmid, a chromosome or minichromosome, a transposon, an IS element, an rDNA, or all kinds of circular or linear DNA or RNA.

The skilled person will understand that nucleic acids having at least 40% homology with the nucleic acid according to SEQ ID No. 1 can also be employed according to the invention. The protein according to SEQ ID No. 2 can also be modified without losing its function. Thus, for example, so-called conservative exchanges of amino acids may be performed. Thus, for example, hydrophobic amino acids can be interchanged.

For the purification and isolation of phospholipase A₁ from Tetrahymena and for determining its sequence, the following methods can be used.
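The text does not state how the 40% homology is to be measured. As one possible reading, the following Python sketch computes the percent identity of two sequences that have already been aligned (gaps written as "-") and compares it with the 40% figure. The function names, the gap handling and the choice of identity over ungapped columns are assumptions made for illustration, not a definition taken from the description.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical positions over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (assumption)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the identity reaches the threshold (default: the 40% named in the text)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy sequences, not taken from the sequence listing.
    print(percent_identity("ATGAA", "ATCCC"))
    print(meets_threshold("ATGAA", "ATCCC"))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch covers only the counting step.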

Recovery of PLA₁

PLA₁ was obtained from cell-free culture supernatants of Tetrahymena thermophila. For this purpose, the cells were fermented in a 2 liter fermenter (Biostat MD, Braun Diessel Biotech, Melsungen, Germany) which was controlled via a digital control unit (DCU). The fermenter was first operated for 24 hours in batch operation and then continuously. Harvesting of the cell-free culture supernatant was ensured through a perfusion module having a pore size of about 0.3 μm (S6/2, Enka, Wuppertal).

The fermentation was performed under the following parameters:

-   the working volume was 2 liters;
-   the perfusion rate was 2 liters/day;
-   the stirrer speed was limited to 800 rpm;
-   the temperature was held constant at 30° C.;
-   the pH value was kept constant at pH 7;
-   the inoculation titer was 50,000 cells/ml.

For the fermentation, the strain SB 1868 VII was used. This is a wild-type strain of Tetrahymena thermophila.

The fermentation was performed over a period of 264 hours, and the harvests were tested for PLA₁ activity.
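The stated fermentation parameters imply a dilution rate and a nominal harvest volume, which the following Python sketch works out. It assumes that the perfusion rate of 2 liters/day was constant and that the entire run after the initial 24 hour batch phase was continuous; the constant and function names are introduced here for illustration only.

```python
# Values taken from the fermentation description.
WORKING_VOLUME_L = 2.0
PERFUSION_RATE_L_PER_DAY = 2.0
BATCH_PHASE_H = 24.0
TOTAL_RUN_H = 264.0


def dilution_rate_per_day(perfusion_l_per_day: float, volume_l: float) -> float:
    """Reactor volumes exchanged per day."""
    return perfusion_l_per_day / volume_l


def nominal_harvest_volume_l(total_h: float, batch_h: float,
                             perfusion_l_per_day: float) -> float:
    """Supernatant collected during the continuous phase, assuming constant perfusion."""
    continuous_h = total_h - batch_h
    return perfusion_l_per_day * continuous_h / 24.0


dilution = dilution_rate_per_day(PERFUSION_RATE_L_PER_DAY, WORKING_VOLUME_L)
harvest = nominal_harvest_volume_l(TOTAL_RUN_H, BATCH_PHASE_H, PERFUSION_RATE_L_PER_DAY)

if __name__ == "__main__":
    print(f"dilution rate: {dilution} per day")
    print(f"nominal harvest: {harvest} liters")
```

Under these assumptions one working volume is exchanged per day, and the continuous phase of 240 hours yields a nominal 20 liters of cell-free supernatant.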

Purification of PLA₁

For the purification of PLA₁, 1 liter of cell-free culture supernatant from the fermentation was used. It was admixed with 140 g of ammonium sulfate and concentrated through an ultrafiltration unit (Pellicon XL, exclusion size 3 kDa, Millipore) to a volume of 50 ml. Subsequently, the sample was purified by hydrophobic interaction chromatography (20×1.6 Fractogel EMD Phenyl I 650, Merck, Darmstadt). The flow rate was 5 ml/min, and the eluate was collected in 5 ml fractions. The enzyme activity was measured by the deacylation of a radioactively labeled phospholipid (L-3-phosphatidylcholine, 1-palmitoyl-2-[1-¹⁴C]linoleoyl). FIG. 3 shows the elution profile obtained, the sodium acetate gradient and the enzyme activities in the individual fractions.

The three fractions having the highest enzyme activities were combined and rebuffered into the starting buffer (Bis-Tris 20 mM, pH 6.5) for anion-exchange chromatography (AEC) by means of an ultrafiltration unit. Subsequently, the sample was charged onto the column (Q-Sepharose-Hiload-16/10, Pharmacia, Sweden), and the PLA₁ was eluted from the column with a linear NaCl gradient (flow rate = 3 ml/min) and collected in 5 ml fractions. FIG. 4 shows the elution profile obtained, the NaCl gradient and the enzyme activities of the individual fractions.

From the fraction having the highest PLA₁ activity, 200 μl was removed and separated by size exclusion chromatography (SEC). For this purpose, a Superdex HR 75 30/10 column (Pharmacia, Sweden) was used. The flow rate in this chromatography was 0.6 ml/min, and the eluate was collected in 200 μl fractions. FIG. 5 shows the elution profile obtained and the enzyme activities of the individual fractions.
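The flow rates and fraction volumes of the three chromatography steps determine how long each fraction takes to collect, and the ultrafiltration step determines a volume concentration factor. The Python sketch below derives these figures from the stated values only; the step labels (HIC, AEC, SEC as dictionary keys) and function names are illustrative assumptions.

```python
# Flow rate and fraction volume for each chromatography step, as stated in the text.
STEPS = {
    "HIC": {"flow_ml_per_min": 5.0, "fraction_ml": 5.0},
    "AEC": {"flow_ml_per_min": 3.0, "fraction_ml": 5.0},
    "SEC": {"flow_ml_per_min": 0.6, "fraction_ml": 0.2},
}


def seconds_per_fraction(flow_ml_per_min: float, fraction_ml: float) -> float:
    """Collection time of one fraction at constant flow."""
    return 60.0 * fraction_ml / flow_ml_per_min


def concentration_factor(start_ml: float, end_ml: float) -> float:
    """Volume reduction achieved by ultrafiltration."""
    return start_ml / end_ml


fraction_times = {
    name: seconds_per_fraction(p["flow_ml_per_min"], p["fraction_ml"])
    for name, p in STEPS.items()
}

# 1 liter of supernatant concentrated to 50 ml.
ultrafiltration_factor = concentration_factor(1000.0, 50.0)

if __name__ == "__main__":
    for name, seconds in fraction_times.items():
        print(f"{name}: {seconds:.0f} s per fraction")
    print(f"ultrafiltration: {ultrafiltration_factor:.0f}-fold volume reduction")
```

These are nominal values that assume constant flow and ignore dead volumes.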

The fractions obtained were examined for their purity using one-dimensional gel electrophoresis. Two distinct bands were found at ˜26 and ˜28 kDa. Two-dimensional gel electrophoresis resolved these two bands into 2 and 3 spots, respectively, having different isoelectric points.

For the 26 kDa proteins, these were at pH 6.3 and 5.7, and for the 28 kDa proteins, they were at pH 6.3, 5.7 and 5.3. A final examination of these spots by mass fingerprint analysis showed that these spots were isoforms of the same protein.

Molecular-Biological Examination of PLA₁

After the purity of the protein had been demonstrated, samples of the protein were blotted onto a PVDF membrane and subjected to initial sequencing from the N terminus. In addition, a further sample was tryptically digested and also subjected to initial sequencing. Using the protein sequences obtained thereby, oligonucleotide primers were prepared, which were then employed in reverse transcriptase PCR (3′ RACE, rapid amplification of cDNA ends). Using this PCR technique, cDNA of phospholipase A₁ was successfully amplified and subsequently sequenced. The sequence obtained had a length of 633 bases and 729 bases, respectively, and the molecular weight of the mature protein derived therefrom is about 22.4 kDa. In the sequence derived, the oligopeptides of 22 amino acids (N-terminal) and 18 amino acids (within the protein) established from protein sequencing were found again to 100%. In addition to the sequence of the mature protein, the sequence of the pre/pro peptide could also be established by means of 5′ RACE (rapid amplification of cDNA ends) (FIG. 2). This is a peptide having a length of 110 amino acids which bears both the signal sequence and the pro-peptide which inactivates the enzyme and is cleaved off only at the final place of activity of the enzyme.

Sequence comparisons yielded no homologies with previously known phospholipases A₁, except for a consensus sequence of 5 amino acids (GXSXG), which is found in all lipases and phospholipases and is discussed as a binding site for lipids or phospholipids. Further, the upstream and downstream sequences of PLA₁ were established by inverse PCR (FIG. 1). For this purpose, genomic DNA was cut with restriction endonucleases, ligated with T4 ligase and finally amplified with inverse primers. For the amplification of the upstream region of PLA₁ by inverse PCR, genomic DNA cut with the restriction endonuclease SspI was used. Thus, an upstream region of 275 bases could be established, and promoter elements identified. In position −136, there is a CAAT box which has a similar distance from the translation start as the CCAAT boxes of the histone genes (−141 and −151) of Tetrahymena as found by Brunk and Sadler (1990). A TATA box, which fixes the exact starting point of transcription in eukaryotic genes, was identified on position −55. Its sequence corresponds to the consensus sequences found in eukaryotes. For the amplification of the downstream region, which contains the terminator for the transcription of the PLA₁ from Tetrahymena, by inverse PCR, genomic DNA cut with BamHI was used. Thus, in addition to the downstream region known from 3′-RACE, another 222 bases could be established (FIG. 3).
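A search for the five-residue consensus mentioned above (glycine, any residue, serine, any residue, glycine) can be sketched with a regular expression. The Python example below is illustrative: the function name is an assumption, and the test sequence is a hypothetical toy string, not the PLA₁ sequence of the sequence listing.

```python
import re

# G, any residue, S, any residue, G. The lookahead lets overlapping hits be reported.
LIPASE_MOTIF = re.compile(r"(?=(G[A-Z]S[A-Z]G))")


def find_gxsxg(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for every GXSXG hit."""
    return [
        (match.start() + 1, match.group(1))
        for match in LIPASE_MOTIF.finditer(protein.upper())
    ]


if __name__ == "__main__":
    toy_sequence = "MKTGHSLGAAGASAG"  # hypothetical example sequence
    for position, motif in find_gxsxg(toy_sequence):
        print(position, motif)
```

Positions are reported 1-based so that they can be compared directly with residue numbering in a sequence listing.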

1. An isolated nucleic acid sequence of SEQ ID NO: 4, consisting of coding and non-coding regions, wherein the coding region codes for phospholipase A₁ (PLA₁) from Tetrahymena thermophila (SEQ ID NO: 5), and non-coding regions represent the upstream and downstream regions of the phospholipase A₁ gene.

2. An isolated nucleic acid sequence of SEQ ID NO: 2 or SEQ ID NO: 3.

3. An isolated nucleic acid sequence of SEQ ID NO: 1, SEQ ID NO: 8 or SEQ ID NO: 9.

4. An isolated signal sequence of the PLA₁ (SEQ ID NO: 5) having the sequence of SEQ ID NO: 6.

5. An isolated PLA₁ protein having the sequence of SEQ ID NO: 7.

6. A method for homologous or heterologous expression of a protein comprising transfecting a ciliate host cell with the nucleic acid of claim 1 and culturing the host cell under conditions allowing expression of phospholipase A₁.

7. The method according to claim 6, wherein the transfected nucleic acid is contained in a vector, plasmid or a cosmid.

8. A method for homologous or heterologous expression of a protein comprising transfecting a ciliate host cell with the nucleic acid of claim 3 and culturing the host cell under conditions allowing expression of phospholipase A₁.

9. A transformed vector, plasmid or cosmid, comprising the nucleic acid of claim 1.

10. A transformed vector, plasmid or cosmid, comprising the nucleic acid of claim 3.